Set up git/ssh, Python, cmdstanpy and cmdstan
Figure 1: A jug of water
A function that can measure the water in a jug.
i.e.
\(p: S \rightarrow [0,1]\) where
Probability functions can describe belief, e.g.
“Definitely B”:
“Not sure if A or B”:
“B a bit more plausible than A”:
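A minimal sketch of the three belief states as probability functions on a two-element set. The set S = {A, B} and all of the numbers are illustrative assumptions, not values from the slides.

```python
# Hypothetical probability assignments over the set S = {"A", "B"}.
# The numbers are illustrative assumptions chosen to match each label.
beliefs = {
    "Definitely B": {"A": 0.0, "B": 1.0},
    "Not sure if A or B": {"A": 0.5, "B": 0.5},
    "B a bit more plausible than A": {"A": 0.4, "B": 0.6},
}


def is_probability_function(p, tol=1e-9):
    """Check that every value lies in [0, 1] and that the values sum to 1."""
    in_range = all(0.0 <= value <= 1.0 for value in p.values())
    return in_range and abs(sum(p.values()) - 1.0) < tol


for label, p in beliefs.items():
    print(label, p, is_probability_function(p))
```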
Figure 2: A nice soup: here is the recipe
In: facts about a spoonful sample
Out: propositions about a soup population
e.g.
Figure 3: A jug of soup
Statistical inference resulting in a probability.
e.g.
Non-Bayesian inferences:
Bayesian inference produces probabilities, which can be interpreted in terms of information and plausible reasoning.
e.g. “According to the model…”
Bayesian inference is old!
This means
Probabilities decompose nicely:
\[ p(\theta, y) = p(\theta)p(y\mid\hat{y}(\theta)) \]
Regression: measured value noisily depends on the true value e.g. \(y \sim N(\hat{y}, \sigma)\).
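On the log scale the decomposition becomes a sum of a prior term and a measurement term. A minimal sketch, assuming a single parameter with a standard normal prior and a straight-line \(\hat{y}\); both choices, and the data, are hypothetical.

```python
import math


def normal_logpdf(x, mu, sigma):
    """Log density of N(mu, sigma) at x, with sigma a standard deviation."""
    z = (x - mu) / sigma
    return -0.5 * math.log(2 * math.pi) - math.log(sigma) - 0.5 * z * z


def y_hat(theta, x):
    # Hypothetical deterministic part: a straight line through the origin.
    return theta * x


def log_prior(theta):
    # Assumed prior for illustration: theta ~ N(0, 1).
    return normal_logpdf(theta, 0.0, 1.0)


def log_likelihood(theta, xs, ys, sigma):
    # Measurement model: y ~ N(y_hat(theta), sigma), independently per point.
    return sum(normal_logpdf(y, y_hat(theta, x), sigma) for x, y in zip(xs, ys))


def log_joint(theta, xs, ys, sigma):
    # log p(theta, y) = log p(theta) + log p(y | y_hat(theta))
    return log_prior(theta) + log_likelihood(theta, xs, ys, sigma)


xs = [1.0, 2.0, 3.0]
ys = [2.1, 3.9, 6.2]  # made-up measurements
print(log_joint(2.0, xs, ys, sigma=0.5))
```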
Biology experiments often have measurement processes with awkward features. e.g.
Bayesian inference is good at describing these.
Figure 6: plot from https://github.com/teddygroves/baseball
Measurement model:
\(y \sim binomial(K, logit^{-1}(ability))\)
GPareto model:
\(ability \sim GPareto(m, k, s)\)
Normal model:
\(ability \sim N(\mu, \tau)\)
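A sketch of how the measurement model combines with the normal ability model in a log density, assuming the success probability is the inverse logit of the latent ability. The generalised Pareto variant is not sketched here, and the counts below are made up.

```python
import math


def inv_logit(x):
    """Map an unconstrained ability to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def binomial_logpmf(y, K, p):
    """Log probability of y successes in K trials with success probability p."""
    return math.log(math.comb(K, y)) + y * math.log(p) + (K - y) * math.log(1.0 - p)


def normal_logpdf(x, mu, sigma):
    z = (x - mu) / sigma
    return -0.5 * math.log(2 * math.pi) - math.log(sigma) - 0.5 * z * z


def log_joint_normal(abilities, ys, Ks, mu, tau):
    """Sum of ability-model and measurement-model terms over all players."""
    total = 0.0
    for ability, y, K in zip(abilities, ys, Ks):
        total += normal_logpdf(ability, mu, tau)
        total += binomial_logpmf(y, K, inv_logit(ability))
    return total


# Hypothetical data: successes ys out of Ks attempts for two players.
ys, Ks = [30, 45], [100, 150]
print(log_joint_normal([-0.85, -0.85], ys, Ks, mu=-1.0, tau=0.5))
```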
Figure 7: From a Stan case study
Information about hares (\(u\)) and lynxes (\(v\)):
\[\begin{align*} \frac{d}{dt}u &= (\alpha - \beta v)u \\ \frac{d}{dt}v &= (-\gamma + \delta u)v \end{align*}\]
i.e. a deterministic function turning \(\alpha\), \(\beta\), \(\gamma\), \(\delta\), \(u(0)\) and \(v(0)\) into \(u(t)\) and \(v(t)\).
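That deterministic function can be sketched with a fixed-step fourth-order Runge-Kutta solver. The solver choice, step count and parameter values are assumptions for illustration; they are not taken from the case study.

```python
def lotka_volterra(u, v, alpha, beta, gamma, delta):
    """Right-hand side of the hare (u) and lynx (v) equations."""
    du = (alpha - beta * v) * u
    dv = (-gamma + delta * u) * v
    return du, dv


def solve(u0, v0, alpha, beta, gamma, delta, t_end, n_steps):
    """Integrate from t=0 to t_end with classical RK4; returns (t, u, v) tuples."""
    h = t_end / n_steps
    u, v = u0, v0
    trajectory = [(0.0, u, v)]
    params = (alpha, beta, gamma, delta)
    for i in range(n_steps):
        k1 = lotka_volterra(u, v, *params)
        k2 = lotka_volterra(u + 0.5 * h * k1[0], v + 0.5 * h * k1[1], *params)
        k3 = lotka_volterra(u + 0.5 * h * k2[0], v + 0.5 * h * k2[1], *params)
        k4 = lotka_volterra(u + h * k3[0], v + h * k3[1], *params)
        u_next = u + h / 6.0 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0])
        v_next = v + h / 6.0 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1])
        u, v = u_next, v_next
        trajectory.append(((i + 1) * h, u, v))
    return trajectory


# Hypothetical parameter values and initial populations.
trajectory = solve(30.0, 4.0, alpha=0.5, beta=0.025, gamma=0.8, delta=0.025,
                   t_end=20.0, n_steps=2000)
print(trajectory[-1])
```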
\(p(\theta \mid y)\) is easy to evaluate but hard to integrate.
This is bad as we typically want something like
\[ p([salt] < 0.1, spoon=s) \]
which is equivalent to
\[ \int_{0}^{0.1}p([salt], spoon=s)d[salt] \]
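In one dimension an integral like this can be approximated on a grid. A minimal sketch using the trapezoid rule, with a made-up exponential density standing in for the integrand, chosen only because its exact integral is known.

```python
import math


def trapezoid(f, a, b, n=1000):
    """Approximate the integral of f over [a, b] using n equal-width panels."""
    h = (b - a) / n
    total = 0.5 * (f(a) + f(b))
    for i in range(1, n):
        total += f(a + i * h)
    return total * h


RATE = 5.0  # hypothetical rate parameter, not from the slides


def salt_density(x):
    # Stand-in density for [salt]: exponential with rate RATE.
    return RATE * math.exp(-RATE * x)


approx = trapezoid(salt_density, 0.0, 0.1)
exact = 1.0 - math.exp(-RATE * 0.1)
print(approx, exact)
```

A grid with 1000 points per dimension needs 1000^d evaluations in d dimensions, so this approach does not scale to models with many parameters.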
\(p(\theta \mid y)\) has one dimension per model parameter.
Figure 8: An image I found online
Strategy:
It (often) works!
We can tell when it doesn’t work!
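As one illustration of replacing integration with sampling, here is a random-walk Metropolis sketch. This is a swapped-in, simpler algorithm, not the Hamiltonian Monte Carlo that Stan uses, and the target density, step size and sample count are assumptions.

```python
import math
import random


def metropolis(log_density, start, n_samples, step=1.0, seed=0):
    """Random-walk Metropolis: only needs the log density up to a constant."""
    rng = random.Random(seed)
    x = start
    lp = log_density(x)
    samples = []
    for _ in range(n_samples):
        proposal = x + rng.gauss(0.0, step)
        lp_proposal = log_density(proposal)
        # Accept with probability min(1, p(proposal) / p(x)).
        if rng.random() < math.exp(min(0.0, lp_proposal - lp)):
            x, lp = proposal, lp_proposal
        samples.append(x)
    return samples


def log_target(x):
    # Hypothetical target: standard normal, unnormalised.
    return -0.5 * x * x


samples = metropolis(log_target, start=0.0, n_samples=20000)
mean = sum(samples) / len(samples)
# A probability becomes a fraction of samples rather than an integral.
frac_below_zero = sum(s < 0.0 for s in samples) / len(samples)
print(mean, frac_below_zero)
```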
Box and Tiao (1992, Ch. 1.1) (available from DTU Findit) gives a nice explanation of statistical inference in general and of why to use Bayes.
Historical interest:
First get a recent (ideally 3.11+) version of Python. This can be very annoying, so talk to me if necessary!
Next get used to Python virtual environments.
The method I like is to put the virtual environment in a folder .venv inside the root of my project:
Then to use:
Tip: use an ergonomic alias to activate venvs, e.g. alias va="source .venv/bin/activate"
First install them:
Now test if they work
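One possible check, not necessarily the one used in the course: ask cmdstanpy whether it can locate a CmdStan installation. The helper name check_install is hypothetical; cmdstan_path is cmdstanpy's own function and raises ValueError when no installation is found.

```python
import importlib.util


def check_install():
    """Report whether cmdstanpy is importable and can find CmdStan."""
    if importlib.util.find_spec("cmdstanpy") is None:
        return "cmdstanpy is not installed in this environment"
    import cmdstanpy

    try:
        path = cmdstanpy.cmdstan_path()
    except ValueError as err:
        return f"cmdstanpy found, but no CmdStan installation: {err}"
    return f"cmdstanpy {cmdstanpy.__version__} found CmdStan at {path}"


print(check_install())
```

This only checks that the two pieces can find each other; compiling and running a small Stan model is a more thorough test.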
Hamiltonian Monte Carlo:
MCMC diagnostics
Stan, cmdstanpy, arviz: